AITopics | dropout method

Collaborating Authors

dropout method

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Uncertainty Quantification for Physics-Informed Neural Networks with Extended Fiducial Inference

Neural Information Processing SystemsJun-12-2026, 16:22:37 GMT

Uncertainty quantification (UQ) in scientific machine learning is increasingly critical as neural networks are widely adopted to tackle complex problems across diverse scientific disciplines. For physics-informed neural networks (PINNs), a prominent model in scientific machine learning, uncertainty is typically quantified using Bayesian or dropout methods. However, both approaches suffer from a fundamental limitation: the prior distribution or dropout rate required to construct honest confidence sets cannot be determined without additional information. In this paper, we propose a novel method within the framework of extended fiducial inference (EFI) to provide rigorous uncertainty quantification for PINNs. The proposed method leverages a narrow-neck hyper-network to learn the parameters of the PINN and quantify their uncertainty based on imputed random errors in the observations. This approach overcomes the limitations of Bayesian and dropout methods, enabling the construction of honest confidence sets based solely on observed data. This advancement represents a significant breakthrough for PINNs, greatly enhancing their reliability, interpretability, and applicability to real-world scientific and engineering challenges. Moreover, it establishes a new theoretical framework for EFI, extending its application to large-scale models, eliminating the need for sparse hyper-networks, and significantly improving the automaticity and robustness of statistical inference.

artificial intelligence, machine learning, proceedings, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Add feedback

Reviews: Improved Dropout for Shallow and Deep Learning

Neural Information Processing SystemsJan-20-2025, 13:45:35 GMT

TECHNICAL QUALITY It is discussed in Section 1 (lines 31-33) that features with low/zero variance can be dropped more frequently or even completely. How is this intuition supported by the theoretical analysis in Section 4 (particularly Eq. (8) or (9))? Note that the features are not automatically zero-meaned. The uniform dropout scheme described in line 135 (as a special case of multinomial dropout when all the sampling probabilities are equal) is only similar but not identical to standard dropout (the sampling probabilities for different features are not i.i.d.). It may not be good to use it in the experiments as if it is indeed the standard dropout scheme.

dropout, experiment, shallow and deep learning, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.58)

Add feedback

FlexiDrop: Theoretical Insights and Practical Advances in Random Dropout Method on GNNs

Zhou, Zhiheng, Liu, Sihao, Zhao, Weichen

arXiv.org Artificial IntelligenceMay-30-2024

Graph Neural Networks (GNNs) are powerful tools for handling graph-type data. Recently, GNNs have been widely applied in various domains, but they also face some issues, such as overfitting, over-smoothing and non-robustness. The existing research indicates that random dropout methods are an effective way to address these issues. However, random dropout methods in GNNs still face unresolved problems. Currently, the choice of dropout rate, often determined by heuristic or grid search methods, can increase the generalization error, contradicting the principal aims of dropout. In this paper, we propose a novel random dropout method for GNNs called FlexiDrop. First, we conduct a theoretical analysis of dropout in GNNs using rademacher complexity and demonstrate that the generalization error of traditional random dropout methods is constrained by a function related to the dropout rate. Subsequently, we use this function as a regularizer to unify the dropout rate and empirical loss within a single loss function, optimizing them simultaneously. Therefore, our method enables adaptive adjustment of the dropout rate and theoretically balances the trade-off between model complexity and generalization ability. Furthermore, extensive experimental results on benchmark datasets show that FlexiDrop outperforms traditional random dropout methods in GNNs.

dropout method, dropout rate, neural network, (14 more...)

arXiv.org Artificial Intelligence

2405.20012

Country:

Asia > Taiwan > Taiwan Province > Taipei (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

LoRA Meets Dropout under a Unified Framework

Wang, Sheng, Chen, Liheng, Jiang, Jiyue, Xue, Boyang, Kong, Lingpeng, Wu, Chuan

arXiv.org Artificial IntelligenceMay-26-2024

With the remarkable capabilities, large language models (LLMs) have emerged as essential elements in numerous NLP applications, while parameter-efficient finetuning, especially LoRA, has gained popularity as a lightweight approach for model customization. Meanwhile, various dropout methods, initially designed for full finetuning with all the parameters updated, alleviates overfitting associated with excessive parameter redundancy. Hence, a possible contradiction arises from negligible trainable parameters of LoRA and the effectiveness of previous dropout methods, which has been largely overlooked. To fill this gap, we first confirm that parameter-efficient LoRA is also overfitting-prone. We then revisit transformer-specific dropout methods, and establish their equivalence and distinctions mathematically and empirically. Building upon this comparative analysis, we introduce a unified framework for a comprehensive investigation, which instantiates these methods based on dropping position, structural pattern and compensation measure. Through this framework, we reveal the new preferences and performance comparisons of them when involved with limited trainable parameters. This framework also allows us to amalgamate the most favorable aspects into a novel dropout method named HiddenKey. Extensive experiments verify the remarkable superiority and sufficiency of HiddenKey across multiple models and tasks, which highlights it as the preferred approach for high-performance and parameter-efficient finetuning of LLMs.

dropattention, dropout method, hiddenkey, (13 more...)

arXiv.org Artificial Intelligence

2403.00812

Country:

Asia > China > Hong Kong (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Europe > Germany > Saarland > Saarbrücken (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

FedSPU: Personalized Federated Learning for Resource-constrained Devices with Stochastic Parameter Update

Niu, Ziru, Dong, Hai, Qin, A. K.

arXiv.org Artificial IntelligenceMar-18-2024

Personalized Federated Learning (PFL) is widely employed in IoT applications to handle high-volume, non-iid client data while ensuring data privacy. However, heterogeneous edge devices owned by clients may impose varying degrees of resource constraints, causing computation and communication bottlenecks for PFL. Federated Dropout has emerged as a popular strategy to address this challenge, wherein only a subset of the global model, i.e. a \textit{sub-model}, is trained on a client's device, thereby reducing computation and communication overheads. Nevertheless, the dropout-based model-pruning strategy may introduce bias, particularly towards non-iid local data. When biased sub-models absorb highly divergent parameters from other clients, performance degradation becomes inevitable. In response, we propose federated learning with stochastic parameter update (FedSPU). Unlike dropout that tailors the global model to small-size local sub-models, FedSPU maintains the full model architecture on each device but randomly freezes a certain percentage of neurons in the local model during training while updating the remaining neurons. This approach ensures that a portion of the local model remains personalized, thereby enhancing the model's robustness against biased parameters from other clients. Experimental results demonstrate that FedSPU outperforms federated dropout by 7.57\% on average in terms of accuracy. Furthermore, an introduced early stopping scheme leads to a significant reduction of the training time by \(24.8\%\sim70.4\%\) while maintaining high accuracy.

federated learning, fedspu, global model, (14 more...)

arXiv.org Artificial Intelligence

2403.11464

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > New York > New York County > New York City (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine (0.93)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Soft-Dropout: A Practical Approach for Mitigating Overfitting in Quantum Convolutional Neural Networks

Shinde, Aakash Ravindra, Jain, Charu, Kalev, Amir

arXiv.org Artificial IntelligenceSep-4-2023

Quantum convolutional neural network (QCNN), an early application for quantum computers in the NISQ era, has been consistently proven successful as a machine learning (ML) algorithm for several tasks with significant accuracy. Derived from its classical counterpart, QCNN is prone to overfitting. Overfitting is a typical shortcoming of ML models that are trained too closely to the availed training dataset and perform relatively poorly on unseen datasets for a similar problem. In this work we study the adaptation of one of the most successful overfitting mitigation method, knows as the (post-training) dropout method, to the quantum setting. We find that a straightforward implementation of this method in the quantum setting leads to a significant and undesirable consequence: a substantial decrease in success probability of the QCNN. We argue that this effect exposes the crucial role of entanglement in QCNNs and the vulnerability of QCNNs to entanglement loss. To handle overfitting, we proposed a softer version of the dropout method. We find that the proposed method allows us to handle successfully overfitting in the test cases.

accuracy, dataset, qcnn, (15 more...)

arXiv.org Artificial Intelligence

2309.01829

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Is it conceivable that neurogenesis, neural Darwinism, and species evolution could all serve as inspiration for the creation of evolutionary deep neural networks?

Al-Rawi, Mohammed

arXiv.org Artificial IntelligenceApr-11-2023

Deep Neural Networks (DNNs) are built using artificial neural networks. They are part of machine learning methods that are capable of learning from data that have been used in a wide range of applications. DNNs are mainly handcrafted and they usually contain numerous layers. Research frontier has emerged that concerns automated construction of DNNs via evolutionary algorithms. This paper emphasizes the importance of what we call two-dimensional brain evolution and how it can inspire two dimensional DNN evolutionary modeling. We also highlight the connection between the dropout method which is widely-used in regularizing DNNs and neurogenesis of the brain, and how these concepts could benefit DNNs evolution.The paper concludes with several recommendations for enhancing the automatic construction of DNNs.

artificial intelligence, evolution, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2304.03122

Country:

Africa > South Africa (0.05)
Oceania > Australia (0.04)
North America > United States (0.04)
(2 more...)

Genre: Research Report (0.83)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Overfitting in quantum machine learning and entangling dropout

Kobayashi, Masahiro, Nakaji, Kouhei, Yamamoto, Naoki

arXiv.org Artificial IntelligenceJan-21-2023

The ultimate goal in machine learning is to construct a model function that has a generalization capability for unseen dataset, based on given training dataset. If the model function has too much expressibility power, then it may overfit to the training data and as a result lose the generalization capability. To avoid such overfitting issue, several techniques have been developed in the classical machine learning regime, and the dropout is one such effective method. This paper proposes a straightforward analogue of this technique in the quantum machine learning regime, the entangling dropout, meaning that some entangling gates in a given parametrized quantum circuit are randomly removed during the training process to reduce the expressibility of the circuit. Some simple case studies are given to show that this technique actually suppresses the overfitting.

artificial intelligence, dropout, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s42484-022-00087-9

2205.11446

Country:

Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)
Europe > Poland (0.04)
Asia > Japan > Honshū > Kantō > Ibaraki Prefecture > Tsukuba (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Improving performance of aircraft detection in satellite imagery while limiting the labelling effort: Hybrid active learning

Imbert, Julie, Dashyan, Gohar, Goupilleau, Alex, Ceillier, Tugdual, Corbineau, Marie-Caroline

arXiv.org Artificial IntelligenceFeb-10-2022

The earth observation industry provides satellite imagery with high spatial resolution and short revisit time. To allow efficient operational employment of these images, automating certain tasks has become necessary. In the defense domain, aircraft detection on satellite imagery is a valuable tool for analysts. Obtaining high performance detectors on such a task can only be achieved by leveraging deep learning and thus us-ing a large amount of labeled data. To obtain labels of a high enough quality, the knowledge of military experts is needed.We propose a hybrid clustering active learning method to select the most relevant data to label, thus limiting the amount of data required and further improving the performances. It combines diversity- and uncertainty-based active learning selection methods. For aircraft detection by segmentation, we show that this method can provide better or competitive results compared to other active learning methods.

active learning, aircraft detection, learning, (11 more...)

arXiv.org Artificial Intelligence

2202.0489

Country:

North America > United States > Hawaii (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > France > Brittany > Ille-et-Vilaine > Rennes (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Sensor selection on graphs via data-driven node sub-sampling in network time series

Jiang, Yiye, Bigot, Jérémie, Maabout, Sofian

arXiv.org Machine LearningApr-24-2020

This paper is concerned by the problem of selecting an optimal sampling set of sensors over a network of time series for the purpose of signal recovery at non-observed sensors with a minimal reconstruction error. The problem is motivated by applications where time-dependent graph signals are collected over redundant networks. In this setting, one may wish to only use a subset of sensors to predict data streams over the whole collection of nodes in the underlying graph. A typical application is the possibility to reduce the power consumption in a network of sensors that may have limited battery supplies. We propose and compare various data-driven strategies to turn off a fixed number of sensors or equivalently to select a sampling set of nodes. We also relate our approach to the existing literature on sensor selection from multivariate data with a (possibly) underlying graph structure. Our methodology combines tools from multivariate time series analysis, graph signal processing, statistical learning in high-dimension and deep learning. To illustrate the performances of our approach, we report numerical experiments on the analysis of real data from bike sharing networks in different cities.

dropout method, reconstruction error, sensor, (14 more...)

arXiv.org Machine Learning

2004.11815

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.05)
Oceania > New Zealand (0.04)
North America > United States > New York (0.04)

Genre:

Research Report (0.81)
Overview (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback